Testing for Association between Categorical Variables with Multiple-Response Data

Author

  • Dan Nettleton
Abstract

Some survey questions provide respondents with a list of possible answers and instructions to "mark all that apply." This paper presents methods for determining when the responses to a mark-all-that-apply question are associated with the responses to a standard question whose answers fall in one of several mutually exclusive categories. Such data originate from many sources, including large-scale social, political, and economic surveys; workplace studies; market research; and health care analyses. Recent approaches to the problem are briefly reviewed, a related alternative procedure is proposed, and extensions that allow for the proper analysis of multiple-response data collected through complex sampling are suggested.

Partial funding provided by the Gallup Research Center.

1 Introduction

Loughin and Scherer (1998) consider testing for association with multiple-response data. They study a survey of 262 Kansas livestock farmers who were asked to specify their education level (1. high school or less, 2. vocational school, 3. two-year college, 4. four-year college, or 5. other) and primary sources of veterinary information (1. professional consultant, 2. veterinarian, 3. state or local extension service, 4. magazines, and 5. feed company representatives). The 262 farmers, who were instructed to mark one education level and as many information sources as applicable, provided 453 responses to the veterinary information question. The survey researchers were interested in determining whether the proportion of farmers using each information source is constant across varying levels of education.

In general, suppose $X$ is a categorical random variable with $I$ levels and $Y = (Y_1, \ldots, Y_J)'$ is a random vector of binary responses, i.e., $Y_j \in \{0, 1\}$ for all $j = 1, \ldots, J$. Suppose we have $n$ independent observations, $\{(X_k, Y_k')' = (X_k, Y_{k1}, \ldots, Y_{kJ})'\}_{k=1,\ldots,n}$, from the joint distribution of $X$ and $Y$.
The vector $Y$ can take any of $2^J$ values, in general, or any of $2^J - 1$ values when $(0, 0, \ldots, 0)'$ is not a valid response. For $i = 1, \ldots, I$ and $j = 1, \ldots, J$, let $m_{ij}$ denote the number of observations for which $X = i$ and $Y_j = 1$. In the survey of livestock farmers, $I = 5$, $J = 5$, $X_k$ takes value $i$ if the $k$th farmer reports education level $i$, and $Y_{kj}$ would be coded as 1 if the $j$th information source is selected by farmer $k$ and 0 otherwise. The count $m_{ij}$ is simply the number of farmers who report education level $i$ and indicate information source $j$ among their sources of veterinary information. The complete livestock farmer survey data can be found in Loughin and Scherer (1998). Loughin and Scherer consider testing

$$H_0 = \bigcap_{j=1}^{J} H_{0j} \quad \text{against} \quad H_1 = \bigcup_{j=1}^{J} H_{1j}, \qquad (1)$$

where, for $j = 1, \ldots, J$, $H_{0j}: P(X = i, Y_j = 1) = P(X = i)P(Y_j = 1)$ for all $i = 1, \ldots, I$ and $H_{1j}: P(X = i, Y_j = 1) \neq P(X = i)P(Y_j = 1)$ for some $i$. The null hypothesis $H_0$ is equivalent to "$X$ is independent of $Y_j$ for all $j = 1, \ldots, J$." Agresti and Liu (1999) refer to this null hypothesis as the hypothesis of multiple marginal independence which is conveniently abbreviated MMI. The alternative hypothesis $H_1$ is equivalent to "$X$ and $Y_j$ are associated for at least one $j \in \{1, \ldots, J\}$."

Recent approaches to testing MMI are sketched in Section 2. As indicated by Agresti and Liu (1999), a problem common to many of these proposals is a lack of invariance to the coding of 0 and 1 values in the components of $Y$. For example, it is possible for a procedure to suggest a departure from MMI when positive responses are coded as 1 but agreement with MMI when positive responses are coded as 0. This...
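The notation above can be made concrete with a small sketch. The Python below uses hypothetical toy data (not the Loughin and Scherer survey) and illustrative function names of our own choosing. It tabulates the counts m_ij and, for each item j, computes an ordinary Pearson statistic on the I x 2 table of X against Y_j. This is only an illustration of the marginal hypotheses H_0j and of the coding issue; it is not one of the procedures reviewed or proposed in the paper, and because the J statistics come from the same respondents they should not be combined as if they were independent.

```python
# Hypothetical toy data (NOT the Loughin and Scherer survey data).
# Each observation is (x, y): x in 1..I is the single-response category,
# y is a tuple of J binary indicators (1 = item marked).
I, J = 3, 2
data = [
    (1, (1, 0)), (1, (1, 1)), (1, (0, 1)),
    (2, (1, 0)), (2, (0, 1)), (2, (0, 1)),
    (3, (1, 1)), (3, (0, 1)), (3, (1, 0)), (3, (1, 0)),
]

# Number of possible values of Y: 2**J in general, or 2**J - 1 when the
# all-zero vector is not a valid response.
n_patterns = 2 ** J
n_patterns_nonempty = 2 ** J - 1


def item_counts(data, I, J):
    """Return m with m[i-1][j-1] = #{k : X_k = i and Y_kj = 1}."""
    m = [[0] * J for _ in range(I)]
    for x, y in data:
        for j, y_j in enumerate(y):
            m[x - 1][j] += y_j
    return m


def pearson_stat(data, I, j):
    """Pearson chi-square statistic for the I x 2 table of X against Y_j.

    Assumes every row and column total is positive (true for the toy data),
    so no expected count is zero.
    """
    n = len(data)
    obs = [[0, 0] for _ in range(I)]
    for x, y in data:
        obs[x - 1][y[j]] += 1
    row = [sum(r) for r in obs]
    col = [sum(r[c] for r in obs) for c in (0, 1)]
    stat = 0.0
    for i in range(I):
        for c in (0, 1):
            expected = row[i] * col[c] / n
            stat += (obs[i][c] - expected) ** 2 / expected
    return stat


def flip_coding(data):
    """Recode every binary response: 1 becomes 0 and 0 becomes 1."""
    return [(x, tuple(1 - v for v in y)) for x, y in data]


m = item_counts(data, I, J)
stats = [pearson_stat(data, I, j) for j in range(J)]

flipped = flip_coding(data)
m_flipped = item_counts(flipped, I, J)
stats_flipped = [pearson_stat(flipped, I, j) for j in range(J)]
```

In this sketch the table of counts m_ij changes when the 0/1 coding is swapped, while each per-item statistic built from the full I x 2 table does not, since recoding only exchanges the two columns. A statistic computed from the m_ij alone can therefore depend on the coding, which is the kind of non-invariance noted by Agresti and Liu (1999).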

Similar resources

Attitude of Health Care Professionals Towards Voluntary Counseling and Testing for HIV/AIDS

Introduction: HIV counseling and testing is the vital and preliminary interventional step aimed at reducing the spread of HIV infection. The study was designed to determine the attitude of health care professionals towards voluntary counseling and testing (VCT) for HIV/AIDS at Irrua Specialist Teaching Hospital. Materials & Methods: In this descriptive cross sectional prospective study a sel...

MRCV: A Package for Analyzing Categorical Variables with Multiple Response Options

Multiple response categorical variables (MRCVs), also known as “pick any” or “choose all that apply” variables, summarize survey questions for which respondents are allowed to select more than one category response option. Traditional methods for analyzing the association between categorical variables are not appropriate with MRCVs due to the within-subject dependence among responses. We have d...

Testing for Simultaneous Pairwise Marginal Independence

1 Introduction What types of cars do you own? What are your sources of veterinary information? For what criminal offenses have you been arrested? These are all example questions appearing on surveys where the respondent is prompted to pick any number of responses from a set of variables that summarize this type of survey data have been called multiple-response (or pick any/c) categorical variab...

Determinants of Inflation in Selected Countries

This paper focuses on developing models to study influential factors on the inflation rate for a panel of available countries in the World Bank database during 2008-2012. For this purpose, random-effect log-linear and ordinal logistic models are used for the analysis of continuous and categorical inflation rate variables. As the original inflation rate response to variables shows an appar...

Statistics Series: Analysis of Contingency Tables 2 (Measures of Association)

The p-value alone cannot provide a complete measure of association between categorical variables in medical studies. In such situations, effect-size measures are required to reveal the clinical importance of a relationship along with its statistical significance. This paper aims to introduce the measures of association for categorical variables and inferences ab...

Partial Association Components in Multi-way Contingency Tables and Their Statistical Analysis

In analyses of contingency tables made up of categorical variables, the study of relationship between the variables is usually the major objective. So far, many association measures and association models have been used to measure the association structure present in the table. Although the association measures merely determine the degree of strength of association between the study varia...

Publication date: 2002